Abstract: A novel class of bit-flipping (BF) algorithms for decoding low-density parity-check (LDPC) codes is presented. The proposed algorithms, which are called gradient descent bit flipping (GDBF) algorithms, can be regarded as simplified gradient descent algorithms. Based on a gradient descent formulation, the proposed algorithms are naturally derived from a simple non-linear objective function.

†: Nagoya Institute of Technology, ‡: Meijo University.



I. Introduction 

Bit-flipping (BF) algorithms for decoding low-density parity-check (LDPC) codes [1] have been investigated extensively, and many variants of BF algorithms such as weighted BF (WBF) [2], modified weighted BF (MWBF) [3], and other variants [4], [5], [6] have been proposed. The first BF algorithm was developed by Gallager [1]. In the decoding process of Gallager's algorithm, some unreliable bits (in a binary quantized received word) corresponding to unsatisfied parity checks are flipped in each iteration. The successors of Gallager's BF algorithm inherit its basic strategy, namely, find unreliable bits and then flip them. Although the bit error rate (BER) performance of the BF algorithm is, in general, inferior to that of the sum-product algorithm or the min-sum algorithm, the BF algorithm enables us to design a much simpler decoder, which is easier to implement. Thus, bridging the performance gap between BF decoding and belief propagation (BP) decoding is an important technical challenge.

In the present paper, a novel class of BF algorithms for decoding LDPC codes is presented. The proposed algorithms, which are called gradient descent bit flipping (GDBF) algorithms, can be regarded as bit-flipping gradient descent algorithms. The proposed algorithms are naturally derived from a simple gradient descent formulation. The behavior of the proposed algorithms can be explained from the viewpoint of the optimization of a non-linear objective function.



II. Preliminaries 



A. Notation 



Let $H$ be a binary $m \times n$ parity check matrix, where $n > m > 1$. The binary linear code $C$ is defined by $C = \{c \in F_2^n : Hc = 0\}$, where $F_2$ denotes the binary Galois field. In the present paper, a vector is assumed to be a column vector. By convention, we introduce the bipolar code $\tilde{C}$ corresponding to $C$ as follows: $\tilde{C} = \{(1-2c_1, 1-2c_2, \ldots, 1-2c_n) : c \in C\}$. Namely, $\tilde{C}$, which is a subset of $\{+1,-1\}^n$, is obtained from $C$ by using binary (0, 1) to bipolar (+1, -1) conversion.

The binary-input AWGN channel is assumed in the paper, which is defined by $y = c + z$ $(c \in \tilde{C})$. The vector $z = (z_1, \ldots, z_n)$ is a white Gaussian noise vector, where $z_j$ $(j \in [1,n])$ is an i.i.d. Gaussian random variable with zero mean and variance $\sigma^2$. The notation $[a,b]$ denotes the set of consecutive integers from $a$ to $b$.

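Let $N(i)$ and $M(j)$ $(i \in [1,m],\ j \in [1,n])$ be $N(i) \triangleq \{j \in [1,n] : h_{ij} = 1\}$ and $M(j) \triangleq \{i \in [1,m] : h_{ij} = 1\}$, where $h_{ij}$ is the $ij$-element of the parity check matrix $H$. Using this notation, we can write the parity condition as $\prod_{j \in N(i)} x_j = 1$ $(\forall i \in [1,m])$, which is equivalent to $(x_1, \ldots, x_n) \in \tilde{C}$. The value $\prod_{j \in N(i)} x_j \in \{+1,-1\}$ is called the $i$-th bipolar syndrome of $x$.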
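The index sets $N(i)$, $M(j)$ and the bipolar syndromes can be illustrated with a minimal Python sketch. The (7,4) Hamming parity-check matrix, the 0-based indexing, and the function names `neighborhoods` and `bipolar_syndromes` are assumptions made only for this illustration; they do not come from the paper.

```python
from math import prod

def neighborhoods(H):
    """Return (N, M): N[i] lists the columns j with H[i][j] == 1, M[j] lists the rows i."""
    m, n = len(H), len(H[0])
    N = [[j for j in range(n) if H[i][j] == 1] for i in range(m)]
    M = [[i for i in range(m) if H[i][j] == 1] for j in range(n)]
    return N, M

def bipolar_syndromes(x, N):
    """i-th bipolar syndrome: product of x_j over j in N(i); +1 means check i holds."""
    return [prod(x[j] for j in Ni) for Ni in N]

# Hypothetical toy example: a (7,4) Hamming parity-check matrix (0-based indices).
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N, M = neighborhoods(H)

x = [+1] * 7          # all-zero binary codeword in bipolar form
x_err = x.copy()
x_err[0] = -1         # flip the first symbol

s_ok = bipolar_syndromes(x, N)        # every check satisfied
s_err = bipolar_syndromes(x_err, N)   # the checks containing bit 0 fail
```

A vector belongs to the bipolar code exactly when every entry of the syndrome list equals +1.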



B. Brief review on known BF algorithms 

A number of variants of BF algorithms have been developed. We can classify the BF algorithms into two classes: single bit flipping (single BF) algorithms and multiple bit flipping (multi BF) algorithms. In the decoding process of a single BF algorithm, only one bit is flipped per iteration according to its bit flipping rule. On the other hand, a multi BF algorithm allows multiple bits to be flipped per iteration. In general, although the multi BF algorithm shows faster convergence than the single BF algorithm, it suffers from oscillation of the decoder state, which is not easy to control. The framework of the single BF algorithms is summarized as follows:



Single BF algorithm

Step 1 For $j \in [1,n]$, let $x_j := \mathrm{sign}(y_j)$. Let $x \triangleq (x_1, \ldots, x_n)$.
Step 2 If the parity equation $\prod_{j \in N(i)} x_j = 1$ holds for all $i \in [1,m]$, output $x$, and then exit.
Step 3 Find the flip position given by

$$\ell := \arg\min_{k \in [1,n]} \Delta_k(x), \tag{1}$$

and then flip the symbol: $x_\ell := -x_\ell$. The function $\Delta_k(x)$ is called an inversion function.
Step 4 If the number of iterations is under the maximum number of iterations $L_{max}$, return to Step 2; otherwise output $x$ and exit.

The function sign(·) is defined by

$$\mathrm{sign}(x) \triangleq \begin{cases} +1, & x \ge 0 \\ -1, & x < 0. \end{cases} \tag{2}$$



In the decoding process of the single BF algorithm, hard decision decoding for a given $y$ is first performed, and $x$ is initialized to the hard decision result. The minimum of the inversion function $\Delta_k(x)$ for $k \in [1,n]$ is then found.¹ An inversion function $\Delta_k(x)$ can be seen as a measure of the invalidness of the bit assignment on $x_k$. The bit $x_\ell$, where $\ell$ gives the smallest value of the inversion function, is then flipped.
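The four steps of the single BF framework can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the toy (7,4) Hamming matrix, the placeholder inversion function `syndrome_sum` (an unweighted sum of syndromes), the lowest-index tie-break, and the returned success flag are all choices made for this sketch.

```python
from math import prod

def sign(v):
    """sign() as in (2): +1 for v >= 0, -1 otherwise."""
    return 1 if v >= 0 else -1

def parity_ok(x, N):
    """True when every bipolar syndrome equals +1."""
    return all(prod(x[j] for j in Ni) == 1 for Ni in N)

def single_bf(y, N, inversion, l_max):
    """Single BF framework; inversion(k, x, y) is a caller-supplied inversion function."""
    x = [sign(v) for v in y]                      # Step 1: hard decision
    for _ in range(l_max):
        if parity_ok(x, N):                       # Step 2: stop on a valid codeword
            return x, True
        # Step 3: flip the bit with the smallest inversion value.
        # Tie-break (an assumption of this sketch): min() keeps the lowest index.
        ell = min(range(len(x)), key=lambda k: inversion(k, x, y))
        x[ell] = -x[ell]
    return x, parity_ok(x, N)                     # Step 4: iteration budget exhausted

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]
M = [[i for i, row in enumerate(H) if row[j]] for j in range(7)]

def syndrome_sum(k, x, y):
    """Placeholder inversion function: unweighted sum of the syndromes involving bit k."""
    return sum(prod(x[j] for j in N[i]) for i in M[k])

y = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]          # all +1 sent, first symbol corrupted
x_hat, ok = single_bf(y, N, syndrome_sum, l_max=10)
```

Passing the inversion function as an argument keeps the framework separate from the flipping rule, so the WBF, MWBF, and GDBF rules differ only in the function supplied.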

The inversion functions of WBF [2] are defined by

$$\Delta_k^{(WBF)}(x) \triangleq \sum_{i \in M(k)} \beta_i \prod_{j \in N(i)} x_j. \tag{3}$$

The values $\beta_i$ $(i \in [1,m])$ are the reliabilities of the bipolar syndromes, defined by $\beta_i \triangleq \min_{j \in N(i)} |y_j|$. In this case, the inversion function $\Delta_k^{(WBF)}(x)$ gives the measure of invalidness of the symbol assignment on $x_k$, which is given by the sum of the weighted bipolar syndromes.

The inversion function of MWBF [3] has a form similar to that of WBF, but it contains a term corresponding to a received symbol. The inversion function of MWBF is given by

$$\Delta_k^{(MWBF)}(x) \triangleq \alpha |y_k| + \sum_{i \in M(k)} \beta_i \prod_{j \in N(i)} x_j, \tag{4}$$

where the parameter $\alpha$ is a positive real number.
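As a small numerical illustration of (3) and (4), the following Python sketch evaluates both inversion functions on a toy example. The (7,4) Hamming matrix, the received vector, and the function names are assumptions introduced here for illustration only.

```python
from math import prod

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]
M = [[i for i, row in enumerate(H) if row[j]] for j in range(7)]

def reliabilities(y):
    """beta_i: smallest |y_j| among the symbols taking part in check i."""
    return [min(abs(y[j]) for j in Ni) for Ni in N]

def delta_wbf(k, x, y):
    """WBF inversion function (3): weighted sum of the syndromes involving bit k."""
    beta = reliabilities(y)
    return sum(beta[i] * prod(x[j] for j in N[i]) for i in M[k])

def delta_mwbf(k, x, y, alpha):
    """MWBF inversion function (4): adds the reliability term alpha * |y_k|."""
    return alpha * abs(y[k]) + delta_wbf(k, x, y)

y = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]     # all +1 sent, first symbol corrupted
x = [1 if v >= 0 else -1 for v in y]          # hard decision

wbf_values = [delta_wbf(k, x, y) for k in range(7)]
mwbf_values = [delta_mwbf(k, x, y, alpha=0.2) for k in range(7)]
flip_wbf = wbf_values.index(min(wbf_values))
flip_mwbf = mwbf_values.index(min(mwbf_values))
```

In this example both rules select the corrupted first symbol, because it takes part in the two unsatisfied checks and has the smallest reliability.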

III. Gradient descent formulation 

A. Objective function 

It seems natural to regard the dynamics of a BF algorithm as a minimization process of a hidden objective function. This observation leads to a gradient descent formulation of BF algorithms.

The maximum likelihood (ML) decoding problem for the binary AWGN channel is equivalent to the problem of finding a (bipolar) codeword in $\tilde{C}$ that gives the largest correlation with a given received word $y$. Namely, the MLD rule can be written as $\hat{x} = \arg\max_{x \in \tilde{C}} \sum_{j=1}^{n} x_j y_j$.

Based on this correlation decoding rule, we here define the following objective function:

$$f(x) \triangleq \sum_{j=1}^{n} x_j y_j + \sum_{i=1}^{m} \prod_{j \in N(i)} x_j. \tag{5}$$

The first term of the objective function corresponds to the correlation between a bipolar codeword and the received word, which should be maximized. The second term is the sum of the bipolar syndromes of $x$. If and only if $x \in \tilde{C}$, the second term attains its maximum value $\sum_{i=1}^{m} \prod_{j \in N(i)} x_j = m$. Thus, this term can be considered as a penalty term, which forces $x$ to be a valid codeword. Note that this objective function is a non-linear function and has many local maxima. These local maxima become a major source of sub-optimality of the GDBF algorithm presented later.
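The two terms of the objective function (5) can be evaluated directly, as in the following Python sketch. The toy (7,4) Hamming matrix, the received vector, and the function name `objective` are assumptions used only for illustration.

```python
from math import prod

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]

def objective(x, y):
    """Objective (5): correlation with y plus the sum of the bipolar syndromes."""
    correlation = sum(xj * yj for xj, yj in zip(x, y))
    penalty = sum(prod(x[j] for j in Ni) for Ni in N)
    return correlation + penalty

y = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]
x_hard = [1 if v >= 0 else -1 for v in y]     # hard decision, not a codeword here
x_code = [1] * 7                              # bipolar form of the all-zero codeword

f_hard = objective(x_hard, y)
f_code = objective(x_code, y)
```

In this example the hard decision has the larger correlation term, yet the codeword attains the larger objective value because its penalty term reaches the maximum $m = 3$.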

¹ When $\Delta_k(x)$ is an integer-valued function, we need a tie-break rule to resolve a tie.



B. Gradient descent BF algorithm 

For the numerical optimization problem for a differentiable function such as (5), the gradient descent method [8] is a natural choice for the first attempt. The partial derivative of $f(x)$ with respect to the variable $x_k$ $(k \in [1,n])$ can be immediately derived from the definition of $f(x)$:

$$\frac{\partial}{\partial x_k} f(x) = y_k + \sum_{i \in M(k)} \prod_{j \in N(i) \setminus k} x_j. \tag{6}$$

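Let us consider the product of $x_k$ and the partial derivative of $f(x)$ with respect to $x_k$, namely

$$x_k \frac{\partial}{\partial x_k} f(x) = x_k y_k + \sum_{i \in M(k)} \prod_{j \in N(i)} x_j. \tag{7}$$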



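For a small real number $s$, we have the first-order approximation:

$$f(x_1, \ldots, x_k + s, \ldots, x_n) \simeq f(x) + s \frac{\partial}{\partial x_k} f(x). \tag{8}$$

When $\frac{\partial}{\partial x_k} f(x) > 0$, we need to choose $s > 0$ in order to have

$$f(x_1, \ldots, x_k + s, \ldots, x_n) > f(x). \tag{9}$$

On the other hand, if $\frac{\partial}{\partial x_k} f(x) < 0$ holds, we should choose $s < 0$ to obtain the inequality (9). Therefore, if $x_k \frac{\partial}{\partial x_k} f(x) < 0$, then flipping the $k$th symbol ($x_k := -x_k$) may increase the objective function value.²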

One reasonable way to find a flipping position is to choose the position at which the absolute value of the partial derivative is largest. This flipping rule is closely related to the steepest descent algorithm based on the $\ell_1$-norm (also known as the coordinate descent algorithm) [8]. According to this observation, we have the following rule to choose the flipping position.

Definition 1 (Inversion function of the GDBF algorithm): The single BF algorithm based on the inversion function

$$\Delta_k^{(GD)}(x) \triangleq x_k y_k + \sum_{i \in M(k)} \prod_{j \in N(i)} x_j \tag{10}$$

is called the gradient descent BF (GDBF) algorithm. □

Thus, the decoding process of the GDBF algorithm can be seen as the minimization process of $-f(x)$ (which can be considered as the energy of the system) based on the bit-flipping gradient descent method.
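One way to see why (10) is a sensible flipping rule on $\{+1,-1\}^n$: flipping $x_k$ negates $x_k y_k$ and every bipolar syndrome that contains $x_k$, so the objective (5) changes by exactly $-2\Delta_k^{(GD)}(x)$. The following Python sketch checks this identity numerically. The toy (7,4) Hamming matrix, the received vector, and the function names are assumptions made for illustration.

```python
from math import prod

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]
M = [[i for i, row in enumerate(H) if row[j]] for j in range(7)]

def objective(x, y):
    """Objective (5)."""
    return sum(a * b for a, b in zip(x, y)) + sum(prod(x[j] for j in Ni) for Ni in N)

def delta_gd(k, x, y):
    """GDBF inversion function (10)."""
    return x[k] * y[k] + sum(prod(x[j] for j in N[i]) for i in M[k])

y = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]
x = [1 if v >= 0 else -1 for v in y]          # hard decision

deltas = [delta_gd(k, x, y) for k in range(7)]
changes = []
for k in range(7):
    flipped = x.copy()
    flipped[k] = -flipped[k]
    changes.append(objective(flipped, y) - objective(x, y))   # effect of flipping bit k

best = deltas.index(min(deltas))              # position chosen by the GDBF rule
```

Since the change in $f$ equals $-2\Delta_k^{(GD)}(x)$, the position with the smallest inversion value is the single flip that increases the objective the most, and a flip with a positive inversion value decreases it.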

It is interesting to see that the combination of the objective function $f(x)$ defined by

$$f(x) \triangleq \alpha \sum_{j=1}^{n} |x_j|\,|y_j| + \sum_{i=1}^{m} \beta_i \prod_{j \in N(i)} x_j \tag{11}$$

and the argument on gradient descent presented above gives the inversion functions of conventional algorithms such as the WBF algorithm (3) and the MWBF algorithm (4). However, this objective function (11) looks less meaningful compared with the objective function (5). In other words, the inversion function $\Delta_k^{(GD)}(x)$ defined in (10) has a more natural interpretation than those of the conventional algorithms: $\Delta_k^{(WBF)}(x)$ in (3) and $\Delta_k^{(MWBF)}(x)$ in (4). Actually, the new inversion function $\Delta_k^{(GD)}(x)$ is not only natural but also effective in terms of bit error performance and convergence speed.

² There is a possibility that the objective function value may decrease because the step size is fixed (such as single flip).

C. Multi GDBF algorithm 

A decoding process of the GDBF algorithm can be regarded as a maximization process of the objective function (5) in a gradient ascent manner. Thus, we can utilize the objective function value in order to observe the convergence behavior. For example, it is possible to monitor the value of the objective function for each iteration. In the first several iterations, the value increases as the number of iterations increases. However, the value eventually ceases to increase when the search point arrives at the nearest point in $\{+1,-1\}^n$ to the local maximum of the objective function. We can easily detect such convergence to a local maximum by observing the value of the objective function.

Both the BF algorithms reviewed in the previous section and the GDBF algorithm flip only one bit for each iteration. In terms of numerical optimization, in these algorithms, a search point moves towards a local maximum with a very small step (i.e., a 1-bit flip) in order to avoid oscillation around the local maximum (see Fig. 1 (A)). However, the small step size leads to slower convergence to a local maximum. In general, compared with the min-sum algorithm, BF algorithms (single flip per iteration) require a larger number of iterations to achieve the same bit error probability.

The multi bit flipping algorithm is expected to have a faster convergence speed than that of the single bit flipping algorithm because of its larger step size. If the search point is close to a local maximum, a fixed large step is not suitable for finding the (near) local maximum point; it leads to oscillation behavior of a multi-bit flipping BF algorithm (Fig. 1 (B)). We need to adjust the step size dynamically from a large step size to a small step size in an optimization process (Fig. 1 (C)).



[Figure panels around a local maximum: (A) single flipping: converging but slow; (B) multiple flipping (fixed): not converging but fast; (C) multiple flipping (dynamic): converging and fast]

Fig. 1. Convergence behavior

The objective function is a useful guideline for adjusting the step size (i.e., number of flipping bits). The multi GDBF algorithm is a GDBF algorithm including the multi-bit flipping idea. In the following, we assume the inversion function $\Delta_k^{(GD)}(x)$ defined by (10) (the inversion function for the GDBF algorithm).



The flow of the multi GDBF algorithm is almost the same as that of the previously presented GDBF algorithm. When it is necessary to clearly distinguish the two decoding algorithms, the GDBF algorithm presented in the previous subsection is referred to as the single GDBF algorithm.

In order to define the multi GDBF algorithm, we need to introduce new parameters $\theta$ and $\mu$. The parameter $\theta$ is a negative real number, which is called the inversion threshold. The binary (0 or 1) variable $\mu$, which is called the mode flag, is set to 0 at the beginning of the decoding process. Step 3 of the single BF algorithm should be replaced with the multi-bit flipping procedure (Step 3 with Sub-steps 3-1 and 3-2).



Usually, at the beginning of a decoding process, the objective function value increases as the number of iterations increases in the multi-bit mode, namely, $f_1 < f_2$ holds for the first few iterations. When the search point eventually arrives at a point satisfying $f_1 > f_2$, the bit flipping mode is changed from the multi-bit mode ($\mu = 0$) to the single-bit mode ($\mu = 1$). This mode change means adjustment of the step size, which helps a search point to converge to a local maximum when the search point is located close to the local maximum.
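The mode switch described above can be sketched in Python as follows. This is a literal reading of Step 3 with Sub-steps 3-1 and 3-2, not the authors' code: the toy (7,4) Hamming matrix, the received vector, the function name `multi_gdbf`, the lowest-index tie-break, and the returned success flag are assumptions made for illustration.

```python
from math import prod

def multi_gdbf(y, N, M, theta, l_max):
    """Multi GDBF sketch: multi-bit mode (mu = 0) until f drops, then single-bit mode (mu = 1)."""
    n = len(y)
    x = [1 if v >= 0 else -1 for v in y]          # hard decision

    def syndromes():
        return [prod(x[j] for j in Ni) for Ni in N]

    def f():                                      # objective (5) at the current x
        return sum(a * b for a, b in zip(x, y)) + sum(syndromes())

    mu = 0                                        # mode flag
    for _ in range(l_max):
        s = syndromes()
        if all(v == 1 for v in s):                # parity condition holds
            return x, True
        delta = [x[k] * y[k] + sum(s[i] for i in M[k]) for k in range(n)]  # (10)
        f1 = f()
        if mu == 0:                               # Sub-step 3-1: multi-bit mode
            for k in range(n):
                if delta[k] < theta:
                    x[k] = -x[k]
            if f1 > f():                          # objective dropped: shrink the step
                mu = 1
        else:                                     # Sub-step 3-2: single-bit mode
            ell = delta.index(min(delta))         # ties resolved by lowest index
            x[ell] = -x[ell]
    return x, all(v == 1 for v in syndromes())

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]
M = [[i for i, row in enumerate(H) if row[j]] for j in range(7)]

y = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]
x_hat, ok = multi_gdbf(y, N, M, theta=-0.6, l_max=20)
```

Note that in this literal sketch the mode changes only on a strict decrease of the objective, so if no inversion value falls below $\theta$ in the multi-bit mode, the state stays unchanged until the iteration limit; a practical decoder may want to treat that case separately.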

IV. Behavior of the GDBF algorithms 

In this section, the behavior and decoding performance of the (single and multi) GDBF algorithms obtained from computer simulations are presented.

Figure 2 presents the objective function values (5) as a function of the number of iterations in the single and multi GDBF processes. Throughout the present paper, a regular LDPC code with $n = 1008$, $m = 504$ (called PEGReg504x1008 in [9]) is assumed. The column weight of the code is 3. In both cases (single and multi), we tested the same noise instance, and both algorithms output the correct codeword (i.e., successful decoding).

In the case of the single GDBF algorithm, the objective function value gradually increases as the number of iterations grows in the first 50-60 iterations. After the slope, the increment of the objective function value eventually stops, and a flat part that corresponds to a local maximum appears. In the flat part of the curves, the oscillation behavior of the objective function value can be seen. Due to the constraint that a search point $x$ must lie in $\{+1,-1\}^n$, a GDBF process cannot find a true local maximum point (the point where the gradient of the objective function becomes a zero vector) of the objective function. Thus, a search point moves around the local maximum point. This move causes the oscillation behavior observed in a single GDBF process. The curve corresponding to the multi GDBF algorithm shows much faster convergence compared with the single GDBF algorithm. It takes only 15 iterations for the search point to come very close to the local maximum point.

Multi-bit flipping procedure (Step 3 of the multi GDBF algorithm)

Step 3 Evaluate the value of the objective function, and let $f_1 := f(x)$. If $\mu = 0$, then execute Sub-step 3-1 (multi-bit mode), else execute Sub-step 3-2 (single-bit mode).
  3-1 Flip all the bits satisfying

  $$\Delta_k^{(GD)}(x) < \theta \quad (k \in [1,n]).$$

  Evaluate the value of the objective function again, and let $f_2 := f(x)$. If $f_1 > f_2$ holds, then let $\mu = 1$.
  3-2 Flip a single bit at the $j$th position, where

  $$j \triangleq \arg\min_{k \in [1,n]} \Delta_k^{(GD)}(x).$$

Fig. 2. Objective function values in GDBF processes as a function of the number of iterations



Figure 3 presents the bit error curves of the single and multi GDBF algorithms ($L_{max} = 100$, $\theta = -0.6$). As references, the curves for the WBF algorithm ($L_{max} = 100$), the MWBF algorithm ($L_{max} = 100$, $\alpha = 0.2$), and the normalized min-sum algorithm ($L_{max} = 5$, scale factor 0.8) are included as well. The parameter $L_{max}$ denotes the maximum number of iterations for each algorithm. We can see that the GDBF algorithms perform much better than the WBF and MWBF algorithms. For example, at BER = 10^^, the multi GDBF algorithm offers a gain of approximately 1.6 dB compared with the MWBF algorithm. Compared with the single GDBF algorithm, the multi GDBF algorithm has a steeper slope in its error curve. Unfortunately, there is still a large performance gap between the error curves of the normalized min-sum algorithm and the GDBF algorithms. The GDBF algorithm fails to decode when a search point is attracted to an undesirable local maximum of the objective function. This large performance gap suggests the existence of some local maxima relatively close to a bipolar codeword, which degrade the BER performance.

Fig. 3. Bit error rate of GDBF algorithms: regular LDPC code (PEGReg504x1008 [9])

Figure 4 shows error curves for an irregular LDPC code. The code used in the simulation is an irregular code (called PEGirReg504x1008 in [9]) constructed based on the PEG construction. The same decoding algorithms (with the same parameters) as in Fig. 3 have been tested. As in the regular case, the error curves of the GDBF algorithms come below those of the WBF and MWBF algorithms. However, the improvement is relatively small compared with the regular case. This observation may imply that the advantage of the GDBF algorithms in BER depends on the type of the code.



[Figure legend: WBF, Lmax=100; MWBF, Lmax=100; single GD-BF, Lmax=100; multi GD-BF, Lmax=100; Normalized min-sum, Lmax=5]

Fig. 4. Bit error rate of GDBF algorithms: irregular LDPC code (PEGirReg504x1008 [9])



In order to evaluate the convergence speed of BF algorithms, the average number of iterations is an appropriate measure. Figure 5 shows the average number of iterations (as a function of SNR) of the GDBF algorithms (single and multi), the WBF algorithm, and the MWBF algorithm. Note that the multi GDBF algorithm indeed has a fast convergence property. Large gaps can be observed between the curve of the multi GDBF algorithm and the other curves.

Fig. 5. Average number of iterations

V. Escape from a local maximum

A. Effect of non-codeword local maxima

As we have discussed, a decoding failure occurs when a search point is captured by a local maximum that is not the transmitted codeword. Thus, it is desirable to know the effect of such local maxima. Figure 6 presents three trajectories of the weight and syndrome weight of a search point in three decoding processes corresponding to decoding failure. The weight of a search point $x$ is defined by $w_1(x) \triangleq |\{j \in [1,n] : x_j = -1\}|$. In a similar way, the syndrome weight of $x$ is given by $w_2(x) \triangleq |\{i \in [1,m] : \prod_{j \in N(i)} x_j = -1\}|$. We assume that the all-1 bipolar codeword (i.e., all-zero binary codeword) is transmitted without loss of generality.
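The two quantities tracked in these trajectories are straightforward to compute, as in the following Python sketch. The toy (7,4) Hamming matrix, the example vector, and the function names `weight` and `syndrome_weight` are assumptions introduced for illustration.

```python
from math import prod

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]

def weight(x):
    """w_1(x): number of positions with x_j = -1."""
    return sum(1 for v in x if v == -1)

def syndrome_weight(x):
    """w_2(x): number of checks whose bipolar syndrome equals -1."""
    return sum(1 for Ni in N if prod(x[j] for j in Ni) == -1)

x = [-1, 1, 1, 1, 1, 1, 1]     # one symbol away from the all-(+1) codeword
w1, w2 = weight(x), syndrome_weight(x)
```

With the all-(+1) codeword transmitted, $w_1(x)$ is the Hamming distance from the search point to the transmitted codeword, and $w_2(x) = 0$ exactly when the search point is a codeword.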

We can obtain the following observations from Fig. 6: (i) the decoding process starts from the position at which both $w_1(x)$ and $w_2(x)$ are large, (ii) $w_1(x)$ and $w_2(x)$ decrease as the iteration proceeds, and (iii) the final states of the search point have relatively small values of $w_1(x)$ and $w_2(x)$.




[Figure axis: weight of $x$]

Fig. 6. Trajectories of weight and syndrome weight of search points

Based on these observations, we may conjecture that a search point is finally trapped, with high probability, by a local maximum close to a near codeword.³ Near codewords [10] are bipolar vectors that have both small weight and small syndrome weight. The sub-optimality of BF algorithms compared with the sum-product and min-sum algorithms comes from the effect of these numerous local maxima.



B. GDBF algorithm with escape process 

Since the weight of the final position of a search point is so small, a small perturbation of a captured search point appears to be helpful for the search point to escape from an undesirable local maximum. We can expect that such a perturbation process improves the BER performance of BF algorithms.

One of the simplest ways to add a perturbation to a trapped search point is to forcibly switch the flip mode from the single-bit mode to the multi-bit mode with an appropriate threshold when the search point arrives at a non-codeword local maximum. This additional process is called the escape process. In general, the escape process reduces the objective function value, i.e., the search point moves downwards in the energy landscape. After the escape process, the search point again begins to climb a hill, which may be different from the one at which it was trapped.

We here modify the multi GDBF algorithm by incorporating two thresholds: $\theta_1$ and $\theta_2$. The threshold $\theta_1$ is the threshold constant used in the multi-bit mode at the beginning of the decoding process. After several iterations, the multi-bit mode is changed to the single-bit mode, and then the search point may eventually arrive at a non-codeword local maximum. In such a case, the decoder changes its mode to the multi-bit mode (i.e., $\mu = 0$) with threshold $\theta_2$. Thus, the threshold $\theta_2$ can be regarded as the threshold for downward movement. Although $\theta_2$ can be a constant value, in terms of the BER performance, it is advantageous to choose it randomly.⁴ In other words, $\theta_2$ can be a random variable. After the downward move (just one iteration), the decoder changes the threshold to $\theta_1$ again. The above process continues until the parity check condition holds or the number of iterations reaches $L_{max}$. Figure 7 illustrates the idea of the escape process.
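A possible shape of the escape process is sketched below in Python. Several details are assumptions of this sketch and are not specified in the text: a trapped state is detected in single-bit mode when no inversion value is negative (so that no single flip can raise the objective), the decoder stays in multi-bit mode with $\theta_1$ after the one-iteration downward move, and the names `gdbf_with_escape` and `sample_theta2`, the toy (7,4) Hamming matrix, and the received vectors are hypothetical.

```python
import random
from math import prod

def gdbf_with_escape(y, N, M, theta1, sample_theta2, l_max):
    """Multi GDBF with an escape step; sample_theta2() draws the downward-move threshold."""
    n = len(y)
    x = [1 if v >= 0 else -1 for v in y]           # hard decision

    def syndromes():
        return [prod(x[j] for j in Ni) for Ni in N]

    def f():                                       # objective (5) at the current x
        return sum(a * b for a, b in zip(x, y)) + sum(syndromes())

    mu, theta, escaping = 0, theta1, False
    for _ in range(l_max):
        s = syndromes()
        if all(v == 1 for v in s):
            return x, True
        delta = [x[k] * y[k] + sum(s[i] for i in M[k]) for k in range(n)]  # (10)
        # Assumed trap test: in single-bit mode, no single flip can raise f.
        if mu == 1 and min(delta) >= 0:
            mu, theta, escaping = 0, sample_theta2(), True
        if mu == 0:                                # multi-bit mode
            f1 = f()
            for k in range(n):
                if delta[k] < theta:
                    x[k] = -x[k]
            if escaping:                           # downward move lasts one iteration
                theta, escaping = theta1, False
            elif f1 > f():
                mu = 1
        else:                                      # single-bit mode
            ell = delta.index(min(delta))
            x[ell] = -x[ell]
    return x, all(v == 1 for v in syndromes())

# Hypothetical toy setup: (7,4) Hamming parity-check matrix, 0-based indices.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
N = [[j for j, h in enumerate(row) if h] for row in H]
M = [[i for i, row in enumerate(H) if row[j]] for j in range(7)]

rng = random.Random(0)                             # seeded for repeatability
draw = lambda: 1.7 + rng.gauss(0.0, 0.1)           # variance 0.01, i.e. std 0.1

easy = [-0.2, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7]
x_easy, ok_easy = gdbf_with_escape(easy, N, M, -0.7, draw, l_max=50)

hard = [-0.9, -0.8, 0.3, 0.4, 1.0, 1.1, 0.2]
x_hard, ok_hard = gdbf_with_escape(hard, N, M, -0.7, draw, l_max=50)
```

The threshold values follow the parameter choice reported for the simulations ($\theta_1 = -0.7$, $\theta_2 = 1.7 + a$); on a code this small they serve only to exercise the control flow, not to reproduce any reported performance.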

[Figure labels: transmitted codeword; trapped search point; single-bit mode; downward move]

Fig. 7. Idea of escape process



³ Note that other experiments also support this conjecture.

⁴ This fact is observed from some experiments.

C. Simulation results

Figure 8 shows the BER curve of such a decoding algorithm (labeled 'multi GDBF with escape'). In this simulation, we used the parameters $\theta_1 = -0.7$ and $\theta_2 = 1.7 + a$, where $a$ is a Gaussian random number with mean zero and variance 0.01. These parameters have been obtained by an ad hoc optimization at SNR = 4 dB. We can see that the BER curve of multi GDBF with escape (with $L_{max} = 300$) is much steeper than that of the naive multi GDBF algorithm. At BER = 10^^, multi GDBF with escape achieves a gain of almost 1.5 dB compared with the naive multi GDBF algorithm. The average number of iterations of multi GDBF with escape is approximately 25.6 at SNR = 4 dB. This result implies that the perturbation can actually help some trapped search points converge to the desirable local maximum corresponding to the transmitted codeword. It is an interesting open problem to optimize the flipping schedule to narrow the gap between the min-sum BER curve and the GDBF BER curve.

Fig. 8. Bit error rate of the GDBF algorithm with the escape process



VI. Conclusion 

This paper presents a class of BF algorithms based on the gradient descent algorithm. GDBF algorithms can be regarded as a maximization process of the objective function using the bit-flipping gradient descent method (i.e., bit-flipping dynamics that minimize the energy $-f(x)$). The gradient descent formulation naturally introduces an energy landscape of the state space of the BF decoder. The viewpoint obtained by this formulation brings us a new way to understand the convergence behavior of BF algorithms. Furthermore, this viewpoint is also useful for designing improved decoding algorithms such as the multi GDBF algorithm and the GDBF algorithm with an escape process from an undesired local maximum. The GDBF algorithm with the escape process performs very well compared with known BF algorithms. One lesson we have learned from this result is that fine control of the flipping schedule is indispensable for improving the decoding performance of BF algorithms.